Tehran Monolingual Corpus : Wikipedia (English edition)

The Tehran Monolingual Corpus (TMC) is a large-scale Persian monolingual corpus. TMC is suited to language modeling and related research areas in natural language processing.
The corpus is extracted from the Hamshahri Corpus and the ISNA news agency website. The quality of the Hamshahri Corpus text was improved for language modeling purposes through a series of tokenization and spell-checking steps.
TMC comprises more than 250 million words. The number of unique words that occur at least twice in the corpus is about 300 thousand, which is relatively good for a highly inflectional language like Persian.
TMC was created by the Natural Language Processing Lab of the University of Tehran. The corpus is free for research use, subject to permission from the corpus aggregator.
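As a rough illustration of how figures like these are obtained, the sketch below counts the total number of tokens and the number of unique words that occur at least twice. It is not the TMC pipeline: the function name `vocabulary_stats`, the whitespace tokenization, and the two sample sentences are all assumptions made for this example.

```python
from collections import Counter
from typing import Iterable, Tuple


def vocabulary_stats(lines: Iterable[str], min_freq: int = 2) -> Tuple[int, int]:
    """Return (total token count, number of word types seen at least min_freq times)."""
    counts = Counter()
    for line in lines:
        # Assumption: the text is already tokenized, with tokens separated by whitespace.
        counts.update(line.split())
    total_tokens = sum(counts.values())
    # Keep only word types whose frequency meets the threshold.
    vocab_size = sum(1 for freq in counts.values() if freq >= min_freq)
    return total_tokens, vocab_size


# Hypothetical two-sentence sample, not taken from the corpus.
sample = [
    "کتاب خوب است",
    "این کتاب خوب نیست",
]
total, vocab = vocabulary_stats(sample)
print(total, vocab)
```

Applying a frequency threshold of two discards words seen only once, many of which in news text are typos or rare proper nouns, so the remaining count is a more stable estimate of vocabulary size.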
==See also==

* TEP: Tehran English-Persian parallel corpus
* Hamshahri Corpus

Excerpt source: Wikipedia, the free encyclopedia